WUST EN-CS Crosslink System at NTCIR-9 CLLD Task
Abstract
This paper describes our work on the Cross-Lingual Link Discovery (Crosslink/CLLD) task at NTCIR-9. The work focuses on two questions: (1) how to collect useful data for Crosslink and (2) how to use that data correctly and effectively. The system first builds a basic Crosslink database through online data collection and text mining of Chinese Wikipedia articles. It then applies this data, together with a two-way expansion algorithm, to identify anchors and find their relevant corresponding matches.
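As a rough illustration of how a Crosslink database might be used to identify anchors, the sketch below looks up word n-grams of an English text in a small English-to-Chinese title dictionary, preferring the longest match. This is a generic dictionary-lookup baseline and not the paper's two-way expansion algorithm, which the abstract does not specify. The function `find_anchors`, the `title_map` structure, and the sample entries are all hypothetical.

```python
from typing import Dict, List, Tuple


def find_anchors(
    text: str, title_map: Dict[str, str], max_words: int = 4
) -> List[Tuple[str, str]]:
    """Greedy longest-match lookup of word n-grams in a title dictionary.

    title_map is a hypothetical stand-in for a Crosslink database:
    English article title -> Chinese article title.
    """
    words = text.split()
    # Case-insensitive lookup table built from the dictionary.
    lookup = {title.lower(): target for title, target in title_map.items()}
    anchors: List[Tuple[str, str]] = []
    i = 0
    while i < len(words):
        matched = False
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_words, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n]).strip(".,;:()")
            if phrase.lower() in lookup:
                anchors.append((phrase, lookup[phrase.lower()]))
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return anchors


# Hypothetical sample data, for illustration only.
sample_map = {"Information retrieval": "信息检索", "Wikipedia": "维基百科"}
sample_text = "Wikipedia articles support information retrieval research."
result = find_anchors(sample_text, sample_map)
```

A real system would also need to rank candidate anchors and resolve ambiguous titles, which this sketch leaves out.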
Similar resources
Overview of the NTCIR-9 Crosslink Task: Cross-lingual Link Discovery
This paper presents an overview of the NTCIR-9 Cross-lingual Link Discovery (Crosslink) task. The overview includes: the motivation of cross-lingual link discovery; the Crosslink task definition; the run submission specification; the assessment and evaluation framework; the evaluation metrics; and the evaluation results of submitted runs. Cross-lingual link discovery (CLLD) is a way of automaticall...
Simple Yet Effective Methods for Cross-Lingual Link Discovery (CLLD) - KMI @ NTCIR-10 CrossLink-2
Cross-Lingual Link Discovery (CLLD) aims to automatically find links between documents written in different languages. In this paper, we first present relatively simple yet effective methods for CLLD in Wiki collections, explaining the findings that motivated their design. Our methods (team KMI) achieved the best overall results in the NTCIR-10 CrossLink-2 evaluation in the English to Chinese...
The Effectiveness of Cross-lingual Link Discovery
This paper describes the evaluation in benchmarking the effectiveness of cross-lingual link discovery (CLLD). Cross-lingual link discovery is a way of automatically finding prospective links between documents in different languages, which is particularly helpful for knowledge discovery of different language domains. A CLLD evaluation framework is proposed for system performance benchmarking. Th...
RDLL at CrossLink Anchor Extraction Considering Ambiguity in CLLD
In this paper, we describe our work in NTCIR-10 on the task of cross-lingual link discovery (CLLD). Our proposed method is focused mainly on two aspects in order to accomplish this task: how to find important anchors from an original article in order to crosslink and how to find the correct links to articles in the target language for the original articles. The system first uses online data col...
IISR Crosslink Approach at NTCIR 9 CLLD Task
In this paper, we describe our approach to the English-Korean Cross-Lingual Link Discovery (CLLD) task in NTCIR 9. We propose a simple and effective approach to discover the links. Our method comprises preprocessing steps, anchor-target link mapping, and the ranking steps. For discovering the links, we use the English anchor names, the inter-language links, and the translation by the Google Tra...